-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Distributed Sampling in cuGraph-PyG #4384
Conversation
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with some comments and questions. The overall logic looks reasonable and trust you with matching the detailed interface of PyG's native Graph- and FeatureStore.
sz = torch.tensor(num_vertices[vtype], device="cuda") | ||
torch.distributed.all_reduce(sz, op=torch.distributed.ReduceOp.MAX) | ||
num_vertices[vtype] = int(sz) | ||
return num_vertices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we cache num_vertices
and num_edges
since they require collective communications?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cache the offsets, which are more frequently used, but I might add caching the vertex and edge counts if they are used more often. So maybe in the future.
/merge |
CI failed with |
/merge |
Reimplements the WG feature store for PyG using the `FeatureStore` interface. Merge after #4384 Closes rapidsai/wholegraph#47 Closes #4399 Authors: - Alex Barghi (https://github.com/alexbarghi-nv) - Seunghwa Kang (https://github.com/seunghwak) - Tingyu Wang (https://github.com/tingyu66) - Ralph Liu (https://github.com/nv-rliu) Approvers: - Tingyu Wang (https://github.com/tingyu66) - Vibhu Jawa (https://github.com/VibhuJawa) - Brad Rees (https://github.com/BradReesWork) - Ray Douglass (https://github.com/raydouglass) URL: #4432
Distributed sampling in cuGraph-PyG. Also renames the existing API to clarify that it is dask based.
Adds a dependency on
tensordict
forcuGraph-PyG
which supports the newTensorDictFeatureStore
.Also no longer installs
torch-cluster
andtorch-spline-conv
in CI for testing since that results in anImportError
and neither of those packages are needed.Requires PyG 2.5. Should be merged after #4335
Merge after #4355
Closes #4248
Closes #4249
Closes #3383
Closes #3942
Closes #3836
Closes #4202
Closes #4051
Closes #4326
Closes #4252
Partially addresses #3805